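Preface
These are my notes on Pod scheduling in K8s, shared here for anyone who finds them useful.
This post covers:
A brief introduction to the kube-scheduler component
Ways to schedule a Pod (node selector, node name, node affinity)
Marking nodes with cordon and drain
Marking nodes with a taint, and defining tolerations on Pods
Prerequisites:
Basic K8s knowledge
Familiarity with creating Pod and Deployment resources, and with YAML resource definitions
Common kubectl commands
If I have misunderstood anything, corrections are welcome.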
Success is simply living your life in your own way. (《明朝那些事》)
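A brief introduction to the Scheduler component
What is the Kubernetes Scheduler
As is well known, the Kubernetes Scheduler is the key module responsible for Pod scheduling in Kubernetes. It runs on the master node of the k8s cluster.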
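Role: the Kubernetes Scheduler takes each Pod awaiting scheduling (a Pod newly created through the API, a Pod created by the Controller Manager to make up a replica count, and so on), applies its scheduling algorithm and scheduling policies, and binds the Pod to a suitable Node in the cluster. The binding information is then written to etcd.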
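Three objects are involved in the scheduling process:
the list of Pods awaiting scheduling
the list of available Nodes
the scheduling algorithm and policies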
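Overall flow: the scheduling algorithm selects the most suitable Node from the Node list for each Pod in the list of Pods awaiting scheduling. The kubelet on the target node then learns of the Pod binding event produced by the Kubernetes Scheduler by watching the API Server, fetches the corresponding Pod manifest, pulls the image, and starts the containers.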
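The kubelet process interacts with the API Server. At a fixed interval it calls the API Server's REST interface to report its own status, and the API Server writes the received node status to etcd. The kubelet also watches Pod information through the API Server's Watch interface:
If it sees that a new Pod replica has been bound to this node, it creates and starts the containers for that Pod.
If it sees that a Pod object has been deleted, it removes the corresponding Pod containers on this node.
If it sees that a Pod has been modified, it updates the Pod containers on this node accordingly.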
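The watch-driven behaviour described above can be sketched as a small event handler. This is a toy model rather than kubelet code: the `FakeKubelet` class and the dict-based Pod records are assumptions made for illustration. Only the event type names `ADDED`, `MODIFIED`, and `DELETED` mirror what a watch on the API Server reports.

```python
from dataclasses import dataclass, field


@dataclass
class FakeKubelet:
    """Hypothetical stand-in for a kubelet; tracks which Pods 'run' on one node."""
    node_name: str
    running: dict = field(default_factory=dict)  # pod name -> pod record

    def handle(self, event_type: str, pod: dict) -> None:
        # A kubelet only cares about Pods bound to its own node.
        if pod.get("nodeName") != self.node_name:
            return
        name = pod["name"]
        if event_type == "ADDED":
            self.running[name] = pod      # create and start the containers
        elif event_type == "MODIFIED":
            self.running[name] = pod      # reconcile containers to the new spec
        elif event_type == "DELETED":
            self.running.pop(name, None)  # stop and remove the containers


kubelet = FakeKubelet("vms82")
kubelet.handle("ADDED", {"name": "pod1", "nodeName": "vms82", "image": "nginx"})
kubelet.handle("ADDED", {"name": "pod2", "nodeName": "vms83", "image": "nginx"})
kubelet.handle("MODIFIED", {"name": "pod1", "nodeName": "vms82", "image": "nginx:1.21"})
print(sorted(kubelet.running))
```

The Pod bound to `vms83` is ignored, which reflects the idea that each kubelet reacts only to bindings for its own node.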
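So the Kubernetes Scheduler acts as a link between the upper and lower layers of the system. Upward, it receives the news that a new Pod has been created through the declarative API or by a controller, and finds a suitable Node for it. Downward, once the Node is chosen, it hands the work over to the kubelet on that Node, and the kubelet takes care of the rest of the Pod's life cycle.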
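Kubernetes Scheduler scheduling flow
The default scheduling flow currently provided by the Kubernetes Scheduler has the following two steps. This part keeps changing between versions, so treat the official documentation as authoritative.
Step | Description
Filtering (predicates) | Iterate over all target Nodes and keep the candidate nodes that meet the requirements. Kubernetes has several built-in filtering policies (xxx Predicates) to choose from.
Choosing the best node (priorities) | Building on step 1, apply the scoring policies (xxx Priority) to compute a score for each candidate node. The node with the highest score wins.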
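The scheduling flow is implemented by an "algorithm provider" (AlgorithmProvider) that is loaded as a plugin. An AlgorithmProvider is essentially a struct that holds a set of predicate (filtering) policies and a set of priority (scoring) policies.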
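As a rough illustration of the two-step flow and of an AlgorithmProvider as "a set of predicates plus a set of priorities", here is a minimal Python sketch. It is not the scheduler's actual code: the `AlgorithmProvider` dataclass, the `schedule` function, the two sample policies, and the `free_cpu`/`cpu` fields are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Predicate = Callable[[dict, dict], bool]  # (pod, node) -> does the pod fit?
Priority = Callable[[dict, dict], int]    # (pod, node) -> score


@dataclass
class AlgorithmProvider:
    """Hypothetical bundle of filtering and scoring policies."""
    predicates: List[Predicate]
    priorities: List[Priority]


def schedule(pod: dict, nodes: List[dict], provider: AlgorithmProvider) -> Optional[str]:
    # Step 1: filtering. A node must pass every predicate.
    candidates = [n for n in nodes if all(p(pod, n) for p in provider.predicates)]
    if not candidates:
        return None  # nothing fits, so the Pod would stay Pending
    # Step 2: scoring. Sum all priority scores and keep the best node.
    best = max(candidates, key=lambda n: sum(f(pod, n) for f in provider.priorities))
    return best["name"]


def has_enough_cpu(pod: dict, node: dict) -> bool:
    return node["free_cpu"] >= pod["cpu"]


def most_free_cpu(pod: dict, node: dict) -> int:
    return node["free_cpu"] - pod["cpu"]


provider = AlgorithmProvider([has_enough_cpu], [most_free_cpu])
nodes = [{"name": "vms82", "free_cpu": 2}, {"name": "vms83", "free_cpu": 4}]
chosen = schedule({"name": "pod1", "cpu": 1}, nodes, provider)
print(chosen)
```

Keeping predicates and priorities as plain lists of functions makes it easy to see why a provider can be swapped out as a unit.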
The predicates available in the Scheduler include NoDiskConflict, PodFitsResources, PodSelectorMatches, PodFitsHost, CheckNodeLabelPresence, CheckServiceAffinity, and PodFitsPorts, among others.
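The predicates loaded by the default AlgorithmProvider each return a boolean. They are:
PodFitsPorts (PodFitsPorts): checks whether the ports the Pod needs conflict with ports already in use on the node
PodFitsResources (PodFitsResources): checks whether the candidate node's resources satisfy the candidate Pod's requests
NoDiskConflict (NoDiskConflict): checks whether the candidate Pod's volumes conflict with volumes already used by Pods on the candidate node
MatchNodeSelector (PodSelectorMatches): checks whether the candidate node carries the labels specified by the candidate Pod's node selector
HostName (PodFitsHost): checks whether the node name given in the candidate Pod's spec.nodeName field matches the candidate node's name
A node must pass all five default predicates above before it is provisionally selected and moves on to the best-node (priority) stage.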
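To make "each predicate returns a boolean" concrete, the five default predicates can be approximated as below. This is a simplified sketch under assumed data shapes (`host_ports`, `used_ports`, `requests`, `free`, `volumes`, `mounted_volumes`, `node_selector`, and `node_name` are hypothetical field names), not the scheduler's implementation. The real NoDiskConflict check, in particular, only applies to certain volume types.

```python
def pod_fits_ports(pod: dict, node: dict) -> bool:
    """PodFitsPorts: no requested host port is already in use on the node."""
    return set(pod.get("host_ports", [])).isdisjoint(node.get("used_ports", []))


def pod_fits_resources(pod: dict, node: dict) -> bool:
    """PodFitsResources: every requested resource fits into what is free."""
    free = node.get("free", {})
    return all(free.get(r, 0) >= amount for r, amount in pod.get("requests", {}).items())


def no_disk_conflict(pod: dict, node: dict) -> bool:
    """NoDiskConflict (simplified): the Pod's volumes are not already mounted on the node."""
    return set(pod.get("volumes", [])).isdisjoint(node.get("mounted_volumes", []))


def match_node_selector(pod: dict, node: dict) -> bool:
    """MatchNodeSelector: the node carries every label in the Pod's nodeSelector."""
    labels = node.get("labels", {})
    return all(labels.get(k) == v for k, v in pod.get("node_selector", {}).items())


def host_name(pod: dict, node: dict) -> bool:
    """HostName: spec.nodeName, when set, must equal the node's name."""
    wanted = pod.get("node_name")
    return wanted is None or wanted == node["name"]


DEFAULT_PREDICATES = [pod_fits_ports, pod_fits_resources, no_disk_conflict,
                      match_node_selector, host_name]


def feasible(pod: dict, node: dict) -> bool:
    # A node is a candidate only if it passes every predicate.
    return all(check(pod, node) for check in DEFAULT_PREDICATES)


nodes = [
    {"name": "vms82", "labels": {"disktype": "node1"}, "free": {"cpu": 2000}},
    {"name": "vms83", "labels": {"disktype": "node2"}, "free": {"cpu": 2000}},
]
pod = {"name": "podnode2", "node_selector": {"disktype": "node2"}, "requests": {"cpu": 500}}
candidates = [n["name"] for n in nodes if feasible(pod, n)]
print(candidates)
```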
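The priority policies in the Scheduler include (but are not limited to) the following three:
LeastRequestedPriority: favors the candidate node with the lowest resource consumption; each node is scored by a formula
CalculateNodeLabelPriority: scores a candidate node according to whether a label listed in the policy is present on it
BalancedResourceAllocation: favors the candidate node whose resource usage is most balanced across resource types; each node is scored by a formula
Each node receives a score from every priority policy. The scores are combined, and the node with the highest total is the result of the priority stage (and therefore of the scheduling algorithm).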
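The two formula-based policies can be illustrated with a small calculation. The formulas below follow the way these policies are commonly described for older scheduler versions, on a 0 to 10 scale, and should be read as an assumption rather than as the current implementation. The function names and the `cpu`/`memory` dict layout are made up for this sketch.

```python
def least_requested(requested: dict, capacity: dict) -> float:
    """Average over CPU and memory of (capacity - requested) * 10 / capacity."""
    scores = [(capacity[r] - requested[r]) * 10 / capacity[r] for r in ("cpu", "memory")]
    return sum(scores) / len(scores)


def balanced_allocation(requested: dict, capacity: dict) -> float:
    """10 minus ten times the gap between the CPU and memory usage fractions."""
    cpu_fraction = requested["cpu"] / capacity["cpu"]
    mem_fraction = requested["memory"] / capacity["memory"]
    return 10 - abs(cpu_fraction - mem_fraction) * 10


capacity = {"cpu": 4000, "memory": 8000}
node_a = {"cpu": 1000, "memory": 2000}  # lightly and evenly used
node_b = {"cpu": 3000, "memory": 2000}  # CPU-heavy usage

for name, requested in (("node_a", node_a), ("node_b", node_b)):
    total = least_requested(requested, capacity) + balanced_allocation(requested, capacity)
    print(name, total)
```

In this example `node_a` scores higher on both policies, so it would be the preferred node when the scores are added up.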
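Pod scheduling
Manually specifying where a Pod runs: label the nodes, then pick a node label with a selector in the Pod definition (much like locating DOM elements in front-end code), or specify the node name directly.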
Common node label commands
Label operation | Command
View | kubectl get nodes --show-labels
Set | kubectl label node node2 disktype=ssd
Remove | kubectl label node node2 disktype-
Set on all nodes | kubectl label node --all key=value
Label the nodes
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl label node vms82.liruilongs.github.io disktype=node1
node/vms82.liruilongs.github.io labeled
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl label node vms83.liruilongs.github.io disktype=node2
node/vms83.liruilongs.github.io labeled
View the node labels
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl get node --show-labels
NAME                         STATUS   ROLES                  AGE   VERSION   LABELS
vms81.liruilongs.github.io   Ready    control-plane,master   45d   v1.22.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=vms81.liruilongs.github.io,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
vms82.liruilongs.github.io   Ready    <none>                 45d   v1.22.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=node1,kubernetes.io/arch=amd64,kubernetes.io/hostname=vms82.liruilongs.github.io,kubernetes.io/os=linux
vms83.liruilongs.github.io   Ready    <none>                 45d   v1.22.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=node2,kubernetes.io/arch=amd64,kubernetes.io/hostname=vms83.liruilongs.github.io,kubernetes.io/os=linux
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$
The special built-in labels node-role.kubernetes.io/control-plane= and node-role.kubernetes.io/master= determine what is shown in the ROLES column.
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl get node
NAME                         STATUS   ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready    control-plane,master   45d   v1.22.2
vms82.liruilongs.github.io   Ready    <none>                 45d   v1.22.2
vms83.liruilongs.github.io   Ready    <none>                 45d   v1.22.2
We can set role labels on the worker nodes in the same way
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl label nodes vms82.liruilongs.github.io node-role.kubernetes.io/worker1=
node/vms82.liruilongs.github.io labeled
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl label nodes vms83.liruilongs.github.io node-role.kubernetes.io/worker2=
node/vms83.liruilongs.github.io labeled
Check the roles
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl get node
NAME                         STATUS   ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready    control-plane,master   45d   v1.22.2
vms82.liruilongs.github.io   Ready    worker1                45d   v1.22.2
vms83.liruilongs.github.io   Ready    worker2                45d   v1.22.2
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$
Node selector (nodeSelector)
To run a Pod on a specific node, label vms83.liruilongs.github.io with disktype=node2 and reference that label in the nodeSelector of the YAML file.
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get nodes -l disktype=node2
NAME                         STATUS   ROLES     AGE   VERSION
vms83.liruilongs.github.io   Ready    worker2   45d   v1.22.2
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$vim pod-node2.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-node2.yaml
pod/podnode2 created
pod-node2.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: podnode2
  name: podnode2
spec:
  nodeSelector:
    disktype: node2
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: podnode2
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
The Pod was scheduled to vms83.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -owide
NAME       READY   STATUS    RESTARTS   AGE   IP             NODE                         NOMINATED NODE   READINESS GATES
podnode2   1/1     Running   0          13m   10.244.70.60   vms83.liruilongs.github.io   <none>           <none>
Node name (nodeName)
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$vim pod-node1.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-node1.yaml
pod/podnode1 created
pod-node1.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: podnode1
  name: podnode1
spec:
  nodeName: vms82.liruilongs.github.io
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: podnode1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Note: when the node label or node name specified in the Pod manifest does not exist, the Pod cannot run. It is created but stays Pending.
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -owide
NAME       READY   STATUS    RESTARTS   AGE   IP               NODE                         NOMINATED NODE   READINESS GATES
podnode1   1/1     Running   0          36s   10.244.171.165   vms82.liruilongs.github.io   <none>           <none>
podnode2   1/1     Running   0          13m   10.244.70.60     vms83.liruilongs.github.io   <none>           <none>
Node affinity
Node affinity means running a Pod on nodes that satisfy the given conditions. There are two kinds: a hard policy (must be satisfied) and a soft policy (preferably satisfied).
Hard policy (requiredDuringSchedulingIgnoredDuringExecution)
The label value of the node the Pod is scheduled to must be one of the listed values. kubernetes.io/hostname is a default label that is added automatically.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - vms85.liruilongs.github.io
          - vms84.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-node-a.yaml
pod/podnodea created
pod-node-a.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: podnodea
  name: podnodea
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: podnodea
    resources: {}
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - vms85.liruilongs.github.io
            - vms84.liruilongs.github.io
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
No node satisfies the condition, so the Pod is Pending
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods
NAME       READY   STATUS    RESTARTS   AGE
podnodea   0/1     Pending   0          8s
I change one of the values to a hostname that exists
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$sed -i 's/vms84.liruilongs.github.io/vms83.liruilongs.github.io/' pod-node-a.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-node-a.yaml
pod/podnodea created
Now the Pod is scheduled successfully
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -owide
NAME       READY   STATUS    RESTARTS   AGE   IP             NODE                         NOMINATED NODE   READINESS GATES
podnodea   1/1     Running   0          13s   10.244.70.61   vms83.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
Soft policy (preferredDuringSchedulingIgnoredDuringExecution)
The Pod is scheduled to one of the listed nodes if possible
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 2
      preference:
        matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - vms85.liruilongs.github.io
          - vms84.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$vim pod-node-a-r.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-node-a-r.yaml
pod/podnodea created
pod-node-a-r.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: podnodea
  name: podnodea
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: podnodea
    resources: {}
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 2
        preference:
          matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - vms85.liruilongs.github.io
            - vms84.liruilongs.github.io
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
Check the result: the Pod is scheduled even though neither preferred node exists
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -owide
NAME       READY   STATUS    RESTARTS   AGE   IP             NODE                         NOMINATED NODE   READINESS GATES
podnodea   1/1     Running   0          28s   10.244.70.62   vms83.liruilongs.github.io   <none>           <none>
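The difference from the hard policy is that a soft policy only contributes to a node's score and never excludes a node. A minimal sketch of that idea follows, assuming a hypothetical `preferred_score` helper that handles only the `In` operator; the real scheduler combines this with its other scoring policies.

```python
def preferred_score(node_labels: dict, preferred_terms: list) -> int:
    """Sum the weights of the preferred terms whose expressions all match (In only)."""
    total = 0
    for term in preferred_terms:
        expressions = term["preference"]["matchExpressions"]
        if all(node_labels.get(e["key"]) in e["values"] for e in expressions):
            total += term["weight"]
    return total


# Same preference as in pod-node-a-r.yaml above.
preferred = [{
    "weight": 2,
    "preference": {"matchExpressions": [{
        "key": "kubernetes.io/hostname",
        "operator": "In",
        "values": ["vms85.liruilongs.github.io", "vms84.liruilongs.github.io"],
    }]},
}]

vms83 = {"kubernetes.io/hostname": "vms83.liruilongs.github.io"}
vms84 = {"kubernetes.io/hostname": "vms84.liruilongs.github.io"}
print(preferred_score(vms83, preferred), preferred_score(vms84, preferred))
```

A node that matches no preference simply gets no bonus, which is why the Pod above could still land on vms83.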
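Common label operators
Operator | Description
In | The label's value is in the given list. The hard-affinity example above, for instance, matches nodes whose kubernetes.io/hostname is one of the two listed values.
NotIn | The opposite of In: nodes whose label value is in the list are not matched.
Exists | Similar to In, but only the key is checked: every node that has the label is selected. With the Exists operator, values must be left empty.
Gt | "Greater than": nodes whose label value is greater than the given value are selected.
Lt | "Less than": nodes whose label value is less than the given value are selected.
DoesNotExist | Nodes that do not have the label.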
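The operators in the table can be expressed as a small matching function. This is an illustrative sketch, not Kubernetes source code: `matches` and `node_matches` are hypothetical helper names, and the `cpu-count` label in the usage lines is invented for the example. The sketch assumes the documented semantics that expressions within one term are ANDed while separate nodeSelectorTerms are ORed.

```python
def matches(labels: dict, expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    if op in ("Gt", "Lt"):
        if key not in labels:
            return False
        try:
            actual, bound = int(labels[key]), int(values[0])
        except (ValueError, IndexError):
            return False  # non-integer values never match Gt/Lt
        return actual > bound if op == "Gt" else actual < bound
    raise ValueError(f"unknown operator: {op}")


def node_matches(labels: dict, terms: list) -> bool:
    """Terms are ORed; the expressions inside one term are ANDed."""
    return any(all(matches(labels, e) for e in t["matchExpressions"]) for t in terms)


labels = {"disktype": "node2", "cpu-count": "8"}
print(matches(labels, {"key": "disktype", "operator": "In", "values": ["node1", "node2"]}))
print(matches(labels, {"key": "cpu-count", "operator": "Gt", "values": ["4"]}))
```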
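Node cordon and drain
To make a node unavailable for scheduling, you can cordon or drain it.
When a node is cordoned, newly created Pods are not scheduled onto it, but Pods already running there are not moved. cordon is meant for node maintenance: when you do not want any more Pods assigned to a node, use cordon to mark it as unschedulable.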
For convenience, we create a Deployment for the demonstration. The deployment has 3 Pod replicas.
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl create deployment nginx --image=nginx --dry-run=client -o yaml >nginx-dep.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$vim nginx-dep.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f nginx-dep.yaml
deployment.apps/nginx created
nginx-dep.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        imagePullPolicy: IfNotPresent
        resources: {}
status: {}
The Pods were scheduled across the two worker nodes
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -owide
NAME                     READY   STATUS    RESTARTS   AGE     IP               NODE                         NOMINATED NODE   READINESS GATES
nginx-7cf7d6dbc8-hx96s   1/1     Running   0          2m16s   10.244.171.167   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-wshxp   1/1     Running   0          2m16s   10.244.70.1      vms83.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-x78x4   1/1     Running   0          2m16s   10.244.70.63     vms83.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
Node cordon
Basic commands
kubectl cordon vms83.liruilongs.github.io
kubectl uncordon vms83.liruilongs.github.io
Use cordon to mark vms83.liruilongs.github.io as unschedulable
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl cordon vms83.liruilongs.github.io
node/vms83.liruilongs.github.io cordoned
Check the node status: vms83.liruilongs.github.io now shows SchedulingDisabled
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get nodes
NAME                         STATUS                     ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready                      control-plane,master   48d   v1.22.2
vms82.liruilongs.github.io   Ready                      worker1                48d   v1.22.2
vms83.liruilongs.github.io   Ready,SchedulingDisabled   worker2                48d   v1.22.2
Change the deployment's replica count with --replicas=6
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl scale deployment nginx --replicas=6
deployment.apps/nginx scaled
The new Pods were all scheduled to vms82.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP               NODE                         NOMINATED NODE   READINESS GATES
nginx-7cf7d6dbc8-2nmsj   1/1     Running   0          64s     10.244.171.170   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-chsrn   1/1     Running   0          63s     10.244.171.168   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-hx96s   1/1     Running   0          7m30s   10.244.171.167   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-lppbp   1/1     Running   0          63s     10.244.171.169   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-wshxp   1/1     Running   0          7m30s   10.244.70.1      vms83.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-x78x4   1/1     Running   0          7m30s   10.244.70.63     vms83.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
Delete the Pods on vms83.liruilongs.github.io, and the replacement Pods are all scheduled to vms82.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl delete pod nginx-7cf7d6dbc8-wshxp
pod "nginx-7cf7d6dbc8-wshxp" deleted
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP               NODE                         NOMINATED NODE   READINESS GATES
nginx-7cf7d6dbc8-2nmsj   1/1     Running   0          2m42s   10.244.171.170   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-5hnc7   1/1     Running   0          10s     10.244.171.171   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-chsrn   1/1     Running   0          2m41s   10.244.171.168   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-hx96s   1/1     Running   0          9m8s    10.244.171.167   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-lppbp   1/1     Running   0          2m41s   10.244.171.169   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-x78x4   1/1     Running   0          9m8s    10.244.70.63     vms83.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl delete pod nginx-7cf7d6dbc8-x78x4
pod "nginx-7cf7d6dbc8-x78x4" deleted
All Pods now sit on the schedulable node
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP               NODE                         NOMINATED NODE   READINESS GATES
nginx-7cf7d6dbc8-2nmsj   1/1     Running   0          3m31s   10.244.171.170   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-5hnc7   1/1     Running   0          59s     10.244.171.171   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-chsrn   1/1     Running   0          3m30s   10.244.171.168   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-hx96s   1/1     Running   0          9m57s   10.244.171.167   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-lppbp   1/1     Running   0          3m30s   10.244.171.169   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-m8ltr   1/1     Running   0          30s     10.244.171.172   vms82.liruilongs.github.io   <none>           <none>
Use uncordon to restore vms83.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl uncordon vms83.liruilongs.github.io
node/vms83.liruilongs.github.io uncordoned
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get nodes
NAME                         STATUS   ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready    control-plane,master   48d   v1.22.2
vms82.liruilongs.github.io   Ready    worker1                48d   v1.22.2
vms83.liruilongs.github.io   Ready    worker2                48d   v1.22.2
Remove all the Pods
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl scale deployment nginx --replicas=0
deployment.apps/nginx scaled
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide
No resources found in liruilong-pod-create namespace.
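Node drain
When a node is drained, no new Pods are scheduled onto it, and the Pods already running on it are evicted to other nodes.
drain combines two actions: cordon (mark the node unschedulable) and evict (evict all Pods currently on the node).
kubectl drain vms83.liruilongs.github.io --ignore-daemonsets
kubectl uncordon vms83.liruilongs.github.io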
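The relationship between cordon and drain can be modeled in a few lines. This is a toy model for intuition only: the `Node` dataclass and the three functions are hypothetical, DaemonSet-managed Pods and local storage are not modeled, and the only detail borrowed from Kubernetes is that a cordoned node is flagged as unschedulable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical node record for this sketch."""
    name: str
    unschedulable: bool = False
    pods: List[str] = field(default_factory=list)


def cordon(node: Node) -> None:
    node.unschedulable = True   # existing Pods stay where they are


def uncordon(node: Node) -> None:
    node.unschedulable = False


def drain(node: Node) -> List[str]:
    """Cordon the node, then evict every Pod on it; returns the evicted Pods."""
    cordon(node)
    evicted, node.pods = node.pods, []
    return evicted


vms82 = Node("vms82", pods=["nginx-a", "nginx-b"])
vms83 = Node("vms83", pods=["nginx-c"])

cordon(vms83)
print(vms83.unschedulable, vms83.pods)   # cordoned, Pods untouched
evicted = drain(vms82)
print(vms82.unschedulable, evicted)      # still cordoned after the drain
```

Note that the node stays unschedulable after a drain until it is explicitly uncordoned, which matches the `kubectl uncordon` step in the commands above.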
Scale the deployment to 4 nginx replicas with --replicas=4
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl scale deployment nginx --replicas=4
deployment.apps/nginx scaled
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide --one-output
NAME                     READY   STATUS    RESTARTS   AGE   IP               NODE                         NOMINATED NODE   READINESS GATES
nginx-7cf7d6dbc8-2clnb   1/1     Running   0          22s   10.244.171.174   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-9p6g2   1/1     Running   0          22s   10.244.70.2      vms83.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-ptqxm   1/1     Running   0          22s   10.244.171.173   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-zmdqm   1/1     Running   0          22s   10.244.70.4      vms83.liruilongs.github.io   <none>           <none>
Drain the node vms82.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl drain vms82.liruilongs.github.io --ignore-daemonsets --delete-emptydir-data
node/vms82.liruilongs.github.io cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-ntm7v, kube-system/kube-proxy-nzm24
evicting pod liruilong-pod-create/nginx-7cf7d6dbc8-ptqxm
evicting pod kube-system/metrics-server-bcfb98c76-wxv5l
evicting pod liruilong-pod-create/nginx-7cf7d6dbc8-2clnb
pod/nginx-7cf7d6dbc8-2clnb evicted
pod/nginx-7cf7d6dbc8-ptqxm evicted
pod/metrics-server-bcfb98c76-wxv5l evicted
node/vms82.liruilongs.github.io evicted
Check the nodes: vms82 is now unschedulable
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get nodes
NAME                         STATUS                     ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready                      control-plane,master   48d   v1.22.2
vms82.liruilongs.github.io   Ready,SchedulingDisabled   worker1                48d   v1.22.2
vms83.liruilongs.github.io   Ready                      worker2                48d   v1.22.2
After a short wait, check where the Pods run: all of them are now on vms83.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide --one-output
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE                         NOMINATED NODE   READINESS GATES
nginx-7cf7d6dbc8-9p6g2   1/1     Running   0          4m20s   10.244.70.2   vms83.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-hkflr   1/1     Running   0          25s     10.244.70.5   vms83.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-qt48k   1/1     Running   0          26s     10.244.70.7   vms83.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-zmdqm   1/1     Running   0          4m20s   10.244.70.4   vms83.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
Undo the drain: kubectl uncordon vms82.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl uncordon vms82.liruilongs.github.io
node/vms82.liruilongs.github.io uncordoned
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
Sometimes drain reports an error
Drain vms82.liruilongs.github.io without --ignore-daemonsets and --delete-emptydir-data. The drain aborts, but the node has already been cordoned:
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl drain vms82.liruilongs.github.io
node/vms82.liruilongs.github.io cordoned
DEPRECATED WARNING: Aborting the drain command in a list of nodes will be deprecated in v1.23.
The new behavior will make the drain command go through all nodes even if one or more nodes failed during the drain.
For now, users can try such experience via: --ignore-errors
error: unable to drain node "vms82.liruilongs.github.io" , aborting command ...

There are pending nodes to be drained:
 vms82.liruilongs.github.io
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-ntm7v, kube-system/kube-proxy-nzm24
cannot delete Pods with local storage (use --delete-emptydir-data to override): kube-system/metrics-server-bcfb98c76-wxv5l
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get nodes
NAME                         STATUS                     ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready                      control-plane,master   48d   v1.22.2
vms82.liruilongs.github.io   Ready,SchedulingDisabled   worker1                48d   v1.22.2
vms83.liruilongs.github.io   Ready                      worker2                48d   v1.22.2
Uncordon the node again
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl uncordon vms82.liruilongs.github.io
node/vms82.liruilongs.github.io uncordoned
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get nodes
NAME                         STATUS   ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready    control-plane,master   48d   v1.22.2
vms82.liruilongs.github.io   Ready    worker1                48d   v1.22.2
vms83.liruilongs.github.io   Ready    worker2                48d   v1.22.2
Node taints and Pod tolerations
By default, Pods are not scheduled onto tainted nodes. The master node never receives ordinary Pods because it carries a taint. To schedule a Pod onto a tainted machine, the Pod must declare tolerations for that taint.
The taint on the master node
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible master -m shell -a "kubectl describe nodes vms81.liruilongs.github.io | grep -E '(Roles|Taints)'"
192.168.26.81 | CHANGED | rc=0 >>
Roles:              control-plane,master
Taints:             node-role.kubernetes.io/master:NoSchedule
Setting and viewing taints
Check a node's roles and whether it has any taints
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl describe nodes vms82.liruilongs.github.io | grep -E '(Roles|Taints)'
Roles:              worker1
Taints:             <none>
Taint the node vms83.liruilongs.github.io with the key key83
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl describe nodes vms83.liruilongs.github.io | grep -E '(Roles|Taints)'
Roles:              worker2
Taints:             <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl taint node vms83.liruilongs.github.io key83=:NoSchedule
node/vms83.liruilongs.github.io tainted
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl describe nodes vms83.liruilongs.github.io | grep -E '(Roles|Taints)'
Roles:              worker2
Taints:             key83:NoSchedule
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$
Recreate the Pods through the deployment. They all land on vms82, because vms83 is tainted.
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl scale deployment nginx --replicas=0
deployment.apps/nginx scaled
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl scale deployment nginx --replicas=4
deployment.apps/nginx scaled
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide --one-output
NAME                     READY   STATUS              RESTARTS   AGE   IP       NODE                         NOMINATED NODE   READINESS GATES
nginx-7cf7d6dbc8-dhst5   0/1     ContainerCreating   0          12s   <none>   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-j6g25   0/1     ContainerCreating   0          12s   <none>   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-wpnhr   0/1     ContainerCreating   0          12s   <none>   vms82.liruilongs.github.io   <none>           <none>
nginx-7cf7d6dbc8-zkww8   0/1     ContainerCreating   0          11s   <none>   vms82.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl delete deployment nginx
deployment.apps "nginx" deleted
Remove the taint
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl taint node vms83.liruilongs.github.io key83-
node/vms83.liruilongs.github.io untainted
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl describe nodes vms83.liruilongs.github.io | grep -E '(Roles|Taints)'
Roles:              worker2
Taints:             <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
Setting operator to Equal
To run a Pod on a tainted node, specify the tolerations property when defining the Pod.
Taint the node vms82.liruilongs.github.io with the key key82 and the value val82
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl taint nodes vms82.liruilongs.github.io key82=val82:NoSchedule
node/vms82.liruilongs.github.io tainted
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl describe nodes vms82.liruilongs.github.io | grep -E '(Roles|Taints)'
Roles:              worker1
Taints:             key82=val82:NoSchedule
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
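When the taint has a non-empty value, a toleration that uses the Equal operator must give the same key and the same value. A toleration that uses Exists leaves the value out and matches on the key alone.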
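The matching rule between a toleration and a taint can be sketched as follows. This is an illustrative Python version of the documented semantics (an empty effect or key in the toleration acts as a wildcard, Equal compares values, Exists ignores the value); the function names `tolerates` and `can_schedule` are hypothetical.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if this single toleration covers this single taint."""
    # An empty effect in the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key (only meaningful with Exists) matches every key.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    raise ValueError(f"unknown operator: {operator}")


def can_schedule(tolerations: list, taints: list) -> bool:
    """Every NoSchedule/NoExecute taint on the node must be tolerated."""
    blocking = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)


taint82 = {"key": "key82", "value": "val82", "effect": "NoSchedule"}
equal_tol = {"key": "key82", "operator": "Equal", "value": "val82", "effect": "NoSchedule"}
exists_tol = {"key": "key82", "operator": "Exists", "effect": "NoSchedule"}

print(tolerates(equal_tol, taint82), tolerates(exists_tol, taint82))
print(can_schedule([], [taint82]))
```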
Edit the YAML file pod-taint3.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
spec:
  nodeSelector:
    disktype: node1   # vms82 carries the label disktype=node1
  tolerations:
  - key: "key82"
    operator: "Equal"
    value: "val82"
    effect: "NoSchedule"
  containers:
  - image: nginx
    name: pod1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
The Pod is created and runs on the tainted node, because it tolerates the taint
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-taint3.yaml
pod/pod1 created
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP               NODE                         NOMINATED NODE   READINESS GATES
pod1   1/1     Running   0          11s   10.244.171.180   vms82.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
Setting operator to Exists
With Exists, the toleration in the Pod must not specify a value.
Taint the node vms83.liruilongs.github.io with an empty value
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl taint node vms83.liruilongs.github.io key83=:NoSchedule
node/vms83.liruilongs.github.io tainted
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl describe nodes vms83.liruilongs.github.io | grep -E '(Roles|Taints)'
Roles:              worker2
Taints:             key83:NoSchedule
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-taint.yaml
pod/pod1 created
pod-taint.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
spec:
  nodeSelector:
    disktype: node2
  tolerations:
  - key: "key83"
    operator: "Exists"
    effect: "NoSchedule"
  containers:
  - image: nginx
    name: pod1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
The Pod was scheduled to the tainted node vms83.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE    IP            NODE                         NOMINATED NODE   READINESS GATES
pod1   1/1     Running   0          3m4s   10.244.70.8   vms83.liruilongs.github.io   <none>           <none>
When the taint has no value, Equal also works, as long as the toleration's value is the empty string
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$cp pod-taint.yaml pod-taint2.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$vim pod-taint2.yaml
pod-taint2.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod1
  name: pod1
spec:
  nodeSelector:
    disktype: node2
  tolerations:
  - key: "key83"
    operator: "Equal"
    value: ""
    effect: "NoSchedule"
  containers:
  - image: nginx
    name: pod1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
The Pod is again scheduled to the tainted node vms83.liruilongs.github.io
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl delete -f pod-taint.yaml
pod "pod1" deleted
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl apply -f pod-taint2.yaml
pod/pod1 created
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl get pods -o wide
NAME   READY   STATUS              RESTARTS   AGE   IP       NODE                         NOMINATED NODE   READINESS GATES
pod1   0/1     ContainerCreating   0          8s    <none>   vms83.liruilongs.github.io   <none>           <none>
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$kubectl taint nodes vms83.liruilongs.github.io key83-
node/vms83.liruilongs.github.io untainted
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-pod-create]
└─$
That is all I have to share about Pod scheduling. Keep going ^_^